High Performance Computing: converging or diverging?
Will there still be generic High Performance Computing (HPC) experts in the near future, or will each HPC platform require its own expertise? That is the fascinating question raised in Robert Roe's (Twitter: @ESRobertRoe) article "Next-generation software: who will write it?", in the October/November issue of Scientific Computing World magazine.
There has always been a tension between portability and performance in the world of High Performance Computing. A new hardware paradigm would always pop up even before you had optimized your software for a specific platform. So the best you could do was to use some generic programming model that would work on basically any hardware, accepting that it would never use the hardware to the max.
Nowadays, there is a broad spectrum of hardware architectures existing side by side. There are the shared memory (multi-core) architectures that are usually programmed using OpenMP or multi-threading. Then we have the distributed memory architectures, for which MPI is the logical choice. In addition, there are the accelerators such as GPUs, which are usually addressed in CUDA or OpenCL. Most present-day computers employ a combination of these architectures. And then we're not even considering hybrid FPGA or ARM platforms…
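To make that contrast concrete, here is a minimal sketch (the function names and the vector-sum example are purely illustrative) of the same reduction written for a shared memory machine with OpenMP and for a distributed memory machine with MPI. The computation is identical; the programming model is not.

```c
#include <mpi.h>
#include <omp.h>

/* Shared memory: one process, the loop iterations are split over threads
 * that all see the same array. */
double sum_openmp(const double *x, int n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Distributed memory: every process holds only its own chunk of the data,
 * and the partial sums are combined with an explicit collective operation. */
double sum_mpi(const double *local_x, int local_n)
{
    double local_sum = 0.0, global_sum = 0.0;
    for (int i = 0; i < local_n; i++)
        local_sum += local_x[i];
    MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    return global_sum;
}
```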
All this time, there was hope that someday we would have a dominant programming model again that would let us develop portable software. The only question was which programming model it would be and when it would become the dominant one.
The article in Scientific Computing World now suggests that High Performance Computing technology may not converge any time soon. That we will be stuck with different technologies existing side by side, each with its own programming model, development tools and dedicated experts. That each technology will be needed to address the various forms of parallelism that may exist in an application.
Those who want to build portable software in this new world will either have to make different versions for different architectures, or will have to create software that exploits several paradigms within one code, as sketched below. That may be a bit too much to ask from a single expert, so you may need different experts for different parts of the code.
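A hedged sketch of what such a combination can look like in practice, building on the example above: MPI distributes the data over the nodes of a cluster, while OpenMP threads work on each node's local chunk. How the data got distributed is assumed to be handled elsewhere.

```c
#include <mpi.h>
#include <omp.h>

/* Hybrid sketch: MPI splits the work over processes (typically one per node),
 * OpenMP splits each process's share over the cores of that node. */
double hybrid_sum(const double *local_x, int local_n)
{
    double local_sum = 0.0, global_sum = 0.0;

    /* Intra-node parallelism: threads share local_x. */
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = 0; i < local_n; i++)
        local_sum += local_x[i];

    /* Inter-node parallelism: explicit message passing. */
    MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    return global_sum;
}
```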
Perhaps the escape is in libraries that offer an extra layer of abstraction (like Paralution). Or maybe we can leave more to the compiler, which might generate an optimized version of a program from a high-level specification. But this is a dream that has failed to come true for more than 20 years. In reality, the scientific software developers of VORtech still have to spell the optimizations out for the compiler before it understands what it is supposed to do.
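As an illustration of that last point (a generic sketch, not VORtech's actual code): a loop like the one below often only vectorizes well once the developer states what the compiler cannot prove on its own, here with restrict qualifiers and an explicit OpenMP SIMD hint.

```c
/* The restrict qualifiers promise the compiler that the arrays do not overlap;
 * the pragma asks it explicitly to generate SIMD instructions. Without such
 * hints, many compilers conservatively fall back to scalar code. */
void axpy(int n, double a, const double * restrict x, double * restrict y)
{
    #pragma omp simd
    for (int i = 0; i < n; i++)
        y[i] += a * x[i];
}
```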
So, developing high performance software may not get easier in the years to come, but will instead become more complex as hardware and software architectures diverge further.