PGI compiler with GPU support

Site:
http://www.pgroup.com

The Portland Group, Inc. (PGI) is a long-time provider of compilers aimed at the HPC user community. PGI 2010 includes the PGI Accelerator Fortran and C99 compilers, which support x64+NVIDIA systems running Linux, Mac OS X and Windows. The PGFORTRAN and PGCC accelerator compilers are supported on all Intel and AMD x64 processor-based systems with CUDA-enabled NVIDIA GPUs.

CUDA is the architecture of the NVIDIA line of GPUs. Currently, the CUDA programming environment consists of an extended C compiler and tool chain, known as CUDA C. CUDA C allows direct programming of the GPU from a high-level language. Third-party wrappers are also available for Python, Perl, Fortran, Java, Ruby, Lua, and MATLAB. The PGI compiler includes support for CUDA Fortran on Linux, Mac OS X and Windows.

GPU designs are optimized for the computations found in graphics rendering, but are general enough to be useful in many data-parallel, compute-intensive programs common in high-performance computing (HPC). CUDA supports four key abstractions: cooperating threads organized into thread groups, shared memory within a thread group, barrier synchronization within a thread group, and coordinated independent thread groups organized into a grid. A CUDA programmer is required to partition the program into coarse-grained blocks that can be executed in parallel. Each block is partitioned into fine-grained threads, which can cooperate using shared memory and barrier synchronization. A properly designed CUDA program will run on any CUDA-enabled GPU, regardless of the number of available processor cores.
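
The block and thread decomposition described above can be illustrated on the host in plain C. The following is a minimal sketch, not CUDA code: the function name saxpy_blocked and the parameter threads_per_block are hypothetical, and the two nested loops only emulate serially what a GPU would schedule in parallel. It assumes a simple element-wise operation with no shared memory or barrier synchronization.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side emulation of a CUDA-style launch.
 * The outer loop stands in for the independent blocks of the grid,
 * the inner loop for the threads inside one block. On a GPU both
 * levels would run in parallel; here they run one after another. */
void saxpy_blocked(size_t n, float a, const float *x, float *y,
                   size_t threads_per_block)
{
    assert(threads_per_block > 0);

    /* Number of blocks needed to cover n elements, rounded up. */
    size_t num_blocks = (n + threads_per_block - 1) / threads_per_block;

    for (size_t block = 0; block < num_blocks; ++block) {       /* coarse grain */
        for (size_t thread = 0; thread < threads_per_block; ++thread) { /* fine grain */
            /* Global element index computed from block and thread ids. */
            size_t i = block * threads_per_block + thread;

            /* Guard: the last block may be only partially filled. */
            if (i < n)
                y[i] = a * x[i] + y[i];
        }
    }
}
```

Calling saxpy_blocked with different values of threads_per_block gives the same result, because each element is handled by exactly one (block, thread) pair. That independence from the chosen partition is what lets the same program be mapped onto GPUs with different numbers of cores.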

The PGI Accelerator programming model does “for GPU programming what OpenMP did for thread programming.” Programmers need only add directives to C and Fortran code, and the compiler does the rest, although some manual tuning may still be needed to get the best performance.
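
As a rough illustration of the directive style, the sketch below marks a loop with a pragma written in the form used by the PGI Accelerator model ("#pragma acc region"). The function name vector_scale_add is hypothetical. It is an assumption of this sketch that a real accelerator build may also need data clauses describing the array extents, which are omitted here. A C compiler without accelerator support ignores the unknown pragma and runs the loop on the host.

```c
#include <assert.h>

/* Hypothetical example: y = a*x + y over n elements.
 * The pragma asks an accelerator-aware compiler to offload the
 * enclosed loop; any other C compiler skips the pragma and the
 * loop executes on the CPU unchanged. */
void vector_scale_add(int n, float a,
                      const float *restrict x, float *restrict y)
{
    assert(n >= 0);

#pragma acc region
    {
        /* Iterations are independent, so they can run in parallel. */
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }
}
```

The source stays ordinary C: removing the pragma, or compiling without GPU support, changes where the loop runs but not what it computes. With the PGI compilers, accelerator code generation is requested with a target option such as -ta=nvidia; treat the exact build line as something to confirm in the compiler documentation.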

The advantages of the PGI Accelerator Model include:
  • Minimal changes to the language – directives/pragmas, in the same vein as vector or OpenMP parallel directives
  • Minimal library calls – usually none
  • Standard x64 toolchain – no changes to makefiles, linkers, build process, standard libraries, other tools
  • Binaries will execute on any compatible x64+GPU hardware system
  • PGI Unified Binary Technology – ensures continued portability to non-GPU-enabled targets
  • One Cross-platform HPC Development Environment
  • One Integrated Suite of Parallel Compilers & Tools

By contrast, programming NVIDIA GPUs directly with tools like CUDA requires substantial effort from application developers, who must explicitly manage the transfer of data to the GPU, the retrieval of results from the GPU, and the restructuring of operations to take advantage of the various levels of parallel processing within the hardware (both vector and multiprocessor). OpenCL has the potential to be supported cross-platform, while CUDA is limited to NVIDIA products.

Applications:
PGI is the compiler of choice for many popular performance-critical applications used in the fields of geophysical modeling, mechanical engineering, computational chemistry, weather forecasting, and high-energy physics. Leading commercial applications built with PGI compilers and tools include ANSYS, ADINA, AVL Fire, POLYFLOW, STAR-CD, LS-DYNA, RADIOSS, PAM-CRASH and GAUSSIAN. Leading community research applications, including AMBER, BLAST, CAM, CHARMM, GAMESS, MCNP5, MM5, MOLPRO, MOM4, POP and WRF2, are built and tested by PGI with each release of the PGI compilers and tools.

With companies integrating GPU hardware into their solutions, and other companies developing tools to make the GPUs themselves easier to use, GPUs are starting to benefit from a real network effect.

License:
Proprietary

More information:
http://www.pgroup.com/lit/presentations/pgi-acc-ieee.pdf
http://insidehpc.com/2009/07/20/pgi-compiler-9-x64-gpu-hybrid-programming/