Continuous improvements in hardware technology allow scientific computing to advance steadily and pursue more ambitious goals. Since the start of the global race toward exascale high-performance computing, massively parallel devices of various architectures have been incorporated into the newest supercomputers, leading to an increasing hybridization of compute nodes. In this context of accelerated innovation, software portability and efficiency become crucial. Traditionally, scientific computing software based on mesh methods is built around calculations in iterative stencil loops over a discretized geometry (the mesh). Although this approach is intuitive and versatile, the interdependency between algorithms and their computational implementations in stencil applications usually results in a large number of subroutines and introduces unavoidable complexity when it comes to portability and sustainability. An alternative is to break the interdependency between the algorithm and its implementation and to cast the calculations into a minimal set of kernels. Algebra-based implementations rely on a reduced set of basic linear algebra subroutines, which simplifies the deployment of software on hybrid computing systems. In this work, we tackle the development of a fully portable algebraic library that can be coupled beneath other high-level, algebra-oriented frameworks. Specifically, this library provides platform portability in the simplest possible manner: the user develops applications in a purely sequential style. Internally, algebraic objects are distributed among computing devices using a multilevel decomposition approach. Data exchanges between computing units or between nodes are hidden by a multithreaded overlapping scheme.