AMD Developer Central

HPC Parallel Programming Models

Overview
The common theme in the evolution of HPC systems is parallelization. Several parallel programming models map directly onto a hierarchy of computing technologies: SIMD (more formally, vector-based) programming, multi-process and multi-threaded programming, and message passing.

» Single Instruction Multiple Data (SIMD) on Single Processors
» Multi-Process and Multi-Threading on SMP Computers
» Message Passing Interface and Clusters

Single Instruction Multiple Data (SIMD) on Single Processors
Historically, parallelization took an inside-out approach, starting at the lowest level of the hierarchy. Specifically, this was the use of a vector-based Instruction Set Architecture (ISA) to perform one operation on multiple data elements at once, as implemented in the classic Cray-1 supercomputer in the 1970s. Such instructions could, for example, add two vectors together (think arrays of floats or doubles) or scale a vector. This led to the development of vectorizing compilers, classically for the Fortran language. Today, the AMD64 ISA includes a variety of Single Instruction Multiple Data (SIMD) instructions that vectorizing compilers from many vendors can use. These include compiler tool chains such as PGI Workstation, the PathScale Compiler Suite, and Sun Studio. Using SIMD effectively requires some knowledge of the problem domain to understand which pieces of source code best lend themselves to this type of parallelization.
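The vector add and scale operations mentioned above can be sketched as plain C loops of the kind a vectorizing compiler looks for. This is a minimal illustration, not code from any particular tool chain: the function names `vec_add` and `vec_scale` are hypothetical, and whether a given compiler actually emits SIMD instructions for them depends on the compiler and on optimization being enabled.

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise add: out[i] = a[i] + b[i].
 * `restrict` (C99) promises the compiler that the arrays do not overlap,
 * which removes one common obstacle to vectorizing the loop. */
void vec_add(size_t n, const float *restrict a, const float *restrict b,
             float *restrict out)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];   /* same operation on every element */
    }
}

/* In-place scale: v[i] = s * v[i]. */
void vec_scale(size_t n, float s, float *v)
{
    assert(v != NULL);
    for (size_t i = 0; i < n; i++) {
        v[i] *= s;
    }
}
```

Both loops have independent iterations and simple, contiguous memory access, which is the pattern that lends itself best to this type of parallelization. Loops with dependencies between iterations are much harder for a compiler to vectorize.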

Multi-Process and Multi-Threading on SMP Computers
What most people think of as parallelization, however, is the use of multiple CPUs in a single computer. This is the next level up in the parallelization hierarchy: classically, multiple CPUs in one machine, typically with a shared memory architecture. The operating system manages the processors, and many applications are written to use multiple processes or multiple threads to parallelize their computations. As with vectorization, some compilers can automatically create and manage threads for certain portions of source code, which is generally called auto-parallelization. Programmers can also thread their code explicitly using the threading APIs provided by the operating system, but explicitly multi-threaded code is difficult to write correctly (data races and deadlocks are common pitfalls), so this route is often avoided. An intermediate approach that relieves many of these difficulties is OpenMP. OpenMP is an API that requires compiler support; a combination of compiler directives and explicit library calls is used to create threads where applicable. Again, the best use of OpenMP requires some domain knowledge, yet many codes parallelize quite well with relatively little effort.

As with SIMD, HPC-centric tools such as PGI Workstation, the PathScale Compiler Suite, and Sun Studio all support auto-parallelization and OpenMP.
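As a rough illustration of the directive-based style, the sketch below parallelizes a dot product with a single OpenMP directive. The function name `dot` is hypothetical. The sketch assumes a compiler with OpenMP support enabled; a compiler without it ignores the directive and runs the loop serially, producing the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product with an OpenMP work-sharing loop.
 * The directive asks the compiler to split the iterations across threads.
 * reduction(+:sum) gives each thread a private partial sum and adds the
 * partial sums together when the loop ends, avoiding a data race on sum. */
double dot(long n, const double *x, const double *y)
{
    assert(n >= 0 && x != NULL && y != NULL);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

The serial and parallel versions differ only by the directive line, which is the main appeal over hand-written thread management. Note that floating-point sums may differ slightly between runs, because the order in which partial sums are combined is not fixed.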


Message Passing Interface and Clusters
The third level of the HPC hierarchy is distributed computing: typically multiple networked symmetric multiprocessing (SMP) systems, located fairly close to each other and often known as clusters. The dominant solution here is MPI, the Message Passing Interface, a standard that is implemented as a library. MPI is a complete programming model, and implementations are available from a variety of open source projects and commercial vendors. It is typically not a direct part of any particular compiler. MPI can work quite well on a single SMP system, but its main strength is the ability to launch processes on the nodes of an HPC cluster, communicate with them over the network interconnect, distribute data to be computed, and pull the results back in.

A hybrid approach of OpenMP and MPI can be effective as well: MPI handles communication between nodes, while OpenMP threads share the work within each node. Combined with compilers that support auto-parallelization and SIMD vectorization, this means that all levels of parallelization can be successfully implemented in commercial and scientific HPC software.