GPI - Global Address Space Programming Interface

Before you continue, have a look at the second generation version: GPI-2.

The use of PGAS models has been discussed as an alternative to MPI for some time now. GPI is an implementation by the Fraunhofer Institut ITWM that, instead of introducing a new language, presents itself as an API for C, C++ and Fortran. GPI has already been used successfully in industrial applications. Benchmarks implemented with GPI show that applications scale well even on highly parallel multicore systems, without the known scaling problems of MPI.



Industrial and scientific applications process more and more data and require increasingly large machines. Modern computer architectures are multi-core: processing systems composed of many independent cores.

This implies a paradigm shift in the development of software: developers must exploit the inherent parallelism of these architectures to be able to scale their applications.

At the same time, it is essential to feed all cores with a continuous data stream exactly when the computation needs it, and to share the work efficiently among the cores. Ideally, data transfers should not burden the CPU.

We address these problems with two easy-to-use, robust and scalable APIs that stay consistently within the same programming model. On the cluster level, our Global address space Programming Interface (GPI) provides a low-latency communication library that works at full wire speed. On the node level, our Multi Core Threading Package (MCTP) supplements GPI with an optimized threading library that has full hardware awareness.


The Global address space Programming Interface (GPI) is a low-latency communication library and runtime system for scalable, real-time parallel applications running on distributed systems with an InfiniBand (IB) or 10 Gigabit Ethernet (10GE) interconnect. It provides a partitioned global address space (PGAS) to the application, which in turn has direct and full access to remote data locations. The functionality includes communication primitives, runtime environment checks, and synchronization primitives such as fast barriers and global atomic counters, all of which allow the development of parallel programs at large scale.
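A typical use of a global atomic counter is dynamic work distribution: each worker fetches the next task id with an atomic fetch-and-add, so no static partitioning is needed. The following is a conceptual sketch of that pattern using local threads and a lock, not the GPI API; the `AtomicCounter` class and `worker` function are illustrative stand-ins for a counter that GPI would expose across all nodes.

```python
import threading

# Conceptual sketch (not the GPI API): a fetch-and-add counter of the kind
# a PGAS runtime exposes globally, used here to hand out work dynamically.
class AtomicCounter:
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def fetch_add(self, n=1):
        # Atomically return the current value and advance it by n.
        with self._lock:
            old = self._value
            self._value += n
            return old

counter = AtomicCounter()
results = []
res_lock = threading.Lock()

def worker(num_tasks):
    # Keep claiming task ids until the counter runs past the last task.
    while True:
        task = counter.fetch_add()
        if task >= num_tasks:
            break
        with res_lock:
            results.append(task)

threads = [threading.Thread(target=worker, args=(100,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every task id 0..99 was claimed exactly once, with no static partitioning.
assert sorted(results) == list(range(100))
```

Because each fetch-and-add returns a unique value, every task is processed exactly once, and faster workers automatically claim more work.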

Focused on performance, GPI leverages the network interconnect hardware to run at wire speed and minimizes communication overhead by overlapping communication and computation (truly asynchronous communication).
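The overlap of communication and computation amounts to a double-buffering loop: while one block of data is being processed, the transfer of the next block already runs in the background. The sketch below illustrates only the pattern, not GPI itself; `fetch_block` is a hypothetical stand-in for an asynchronous one-sided read.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# fetch_block simulates an asynchronous remote read (hypothetical, not GPI).
def fetch_block(i):
    time.sleep(0.01)  # simulated network transfer time
    return list(range(i * 4, i * 4 + 4))

def compute(block):
    return sum(block)

totals = []
with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(fetch_block, 0)      # prefetch the first block
    for i in range(1, 5):
        block = pending.result()               # wait only if the transfer lags
        pending = pool.submit(fetch_block, i)  # next transfer runs in background...
        totals.append(compute(block))          # ...while we compute on this block
    totals.append(compute(pending.result()))   # drain the last transfer

assert totals == [6, 22, 38, 54, 70]
```

When the computation per block takes at least as long as a transfer, the transfer time is hidden entirely; this is the effect GPI achieves in hardware by offloading transfers to the interconnect instead of the CPU.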

GPI provides a simple, reliable runtime system to handle large datasets and dynamic, irregular applications that are I/O- and compute-intensive.


The MCTP supplements GPI on the node level. Software developers can no longer rely on increasing clock speeds; a multithreaded design is necessary in order to achieve scalable applications.

The MCTP is a threading package that makes multithreaded programming more programmer-friendly. It abstracts the platform's native threads and provides complete, state-of-the-art functionality for working with threads, thread pools, critical sections and related topics such as synchronization and high-frequency timers.
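A core service of any such threading package is synchronization between worker threads. The sketch below shows one representative primitive, a thread barrier, in plain Python; it illustrates the kind of functionality described above, not MCTP's actual interface.

```python
import threading

# Illustrative sketch of a synchronization primitive a threading package
# provides: a barrier ensuring all workers finish phase 1 before any
# worker starts phase 2 (names here are not MCTP's API).
NUM_THREADS = 4
barrier = threading.Barrier(NUM_THREADS)
phase1 = [0] * NUM_THREADS
phase2 = [0] * NUM_THREADS

def worker(tid):
    phase1[tid] = tid * tid   # phase 1: independent per-thread work
    barrier.wait()            # nobody proceeds until all threads arrive
    phase2[tid] = sum(phase1) # phase 2: safely reads every thread's result

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread saw the complete phase-1 results: 0 + 1 + 4 + 9 = 14.
assert phase2 == [14] * NUM_THREADS
```

Without the barrier, a fast thread could read `phase1` before slower threads have written their entries; the barrier makes the two phases a safe producer/consumer handoff.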

One of the key features of MCTP, clearly distinguishing it from other threading libraries, is its full hardware awareness, including the NUMA layout and the cache-to-core mapping.
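Hardware awareness of this kind rests on the ability to bind execution to specific cores so that cache contents and NUMA-local memory stay close to the thread using them. The Linux-specific sketch below shows only the underlying operating-system mechanism (CPU affinity), not how MCTP itself performs its mapping.

```python
import os

# Linux-specific sketch of the mechanism behind core pinning: restrict the
# current process to one core so cache and NUMA locality are preserved.
# (This demonstrates the OS primitive only, not MCTP's internal mapping.)
available = os.sched_getaffinity(0)   # set of cores we may currently run on
target = {min(available)}             # pick a single core from that set
os.sched_setaffinity(0, target)       # pin the process to it
assert os.sched_getaffinity(0) == target
os.sched_setaffinity(0, available)    # restore the original affinity mask
```

A hardware-aware runtime extends this idea by also choosing *which* core to pin each thread to, based on the NUMA node that holds the thread's data and on which cores share a cache.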