Scientific programmers are accustomed to expressing in their programs the who (variable declarations) and the what (operations), in some sequentialized order, and leaving the questions of when and where to the systems software and hardware. This delegation is appropriate at small scales, since programmer management of pipelines, multiple functional units, and multilevel caches is presently beyond reward, and the depth and complexity of such performance-motivated architectural developments are sure to increase. However, disregarding the differing costs of accessing different locations in memory (that is, assuming a flat memory model) can put unnecessary synchronization and data motion on the critical path of program execution. Different organizations of an algorithm that lead to mathematically equivalent results can expose very different amounts of synchronization and data motion, and algorithmicists of the future will have to be conscious of, and adapt to, the distributed and hierarchical aspects of memory architecture.
Document type: Part of book or chapter of book