The dramatic strides in computer speed and performance over the last few decades make it feasible to accurately model increasingly complex phenomena. However, achieving high performance on massively parallel supercomputers is an extremely challenging task. With deepening memory hierarchies, significantly higher degrees of per-chip multi-core parallelism, the task of programming compute-intensive engineering applications to attain high performance on a large scale cluster system has become increasingly difficult. It is often the case that the time and effort required to develop effective and efficient software has become the bottleneck in advancing many areas of science and engineering. This challenge can be overcome by advances in compile-time/runtime systems that can ease the burden on the programmer while delivering a high performance portable instantiation of the particular application on modern and emerging high performance platforms.
To address this challenge, this project is developing a novel framework for transforming irregular scientific/engineering applications in a global address space framework. The research is grounded in a very different and complementary research direction to most current efforts in addressing the challenge of enhancing programmer productivity, maintaining portability, and achieving good performance on scalable distributed-memory parallel systems. The project will advance compiler/runtime techniques so that users can develop annotated sequential programs, to be automatically transformed by our system for efficient execution on distributed-memory parallel systems. This approach is motivated by the success of the popular OpenMP and OpenACC pragma based approaches to transforming annotated sequential programs for parallel execution on multicore and GPU/accelerator systems, respectively. An annotation based OpenAPP (APP – Asynchronous Partitioned Parallelism) framework is proposed for source-to-source transformation of an important class of scientific/engineering programs using the inspector/executor paradigm for execution on distributed-memory parallel systems. The proposed framework will be validated using several medium to large scale applications.
The project seeks to significantly lower the entry barrier associated with effective use of scalable distributed-memory computers, which are essential if more than 100x performance improvement over sequential codes is sought. A successful outcome of this project will be transformative for computational and domain scientists and engineers who seek to use next generation parallel systems for their simulation and modeling. The developed tools will be made publicly available to the community under an open source license. The project will also organize workshops that bring together compiler/runtime experts and computational scientists developing massively parallel scientific/engineering applications.