CGD Library Primitives

Types

The library can work with arbitrary global datastructure types, as long as a range type is defined to represent datastructure subdomains:

Range               Type representing a domain of a global datastructure (U)
Data                Datastructure type decomposable by Range domains (U)
Partition <Range>   Decomposition type assigning a domain to each PE (L)
MPartition <Range>  Decomposition type assigning a list of domains to each PE (L)
Swap <Range>        Describes a redistribution operation (L)
  • Types tagged (U) are user defined
  • Types tagged (L) are template types defined by the library
  • A decomposition/partition may consist of overlapping domains
  • A redistribution/swap type records, for each PE, which domains/ranges must be sent and received to redistribute a datastructure
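
For example, a user could define the range type for a 1-D array as a half-open index interval. The names below (Range1D, Data1D) are hypothetical and only illustrate the two user-defined (U) roles:

    // Hypothetical user-defined types for a 1-D array datastructure; the
    // names (Range1D, Data1D) are illustrative, not part of the library.
    struct Range1D {
        long lo, hi;           // half-open index interval [lo, hi)
    };

    struct Data1D {
        double* elems;         // storage for the allocated domain
        Range1D alloc;         // domain the storage covers
    };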

As previously mentioned, an SPMD program consists of sequential computations plus distributed datastructure allocation and redistribution primitives. When parallel computations are written in the CGD language, only type definitions, sequential computations, and helper functions are written in C++.

Datastructure Allocation

Data (T, A) Allocates datastructure A of type T
DataRa (T, A, R) Allocates global datastructure A of type T for domain R
  • Datastructures can be allocated without being decomposed
  • A datastructure should be allocated for the largest domain it will hold
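
Using the hypothetical 1-D types above, the calling pattern might look as follows (the variable names are assumptions; only the two macros are the library's):

    Range1D whole = {0, 1000000};   // largest domain A will ever hold
    DataRa (Data1D, A, whole);      // allocate A for that whole domain up front
    Data (Data1D, B);               // B is allocated but not yet decomposed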

Datastructure Redistribution

A global datastructure is transformed between two decompositions with a begin/end pair of primitives; the swap itself can be computed from two partitions:

swapBegin (Swap S, Data A, Data B, int id, int pe)   Starts redistribution of A to B according to S
swapEnd (Swap S, Data A, Data B, int id, int pe)     Finalizes the redistribution
computeSwap (Partition Pa, Partition Pb, Swap S)     Computes the swap that transforms decomposition Pa to Pb

Swaps:

  • Datastructures A, B can be the same
  • pe represents the current PE number
  • Overlapping swaps are possible when different ids are used; the maximum id number is user defined
  • If S describes a redistribution from decomposition Pa to Pb
    • A should hold data for domain Pa [pe]
    • B should be able to hold data for domain Pb [pe]
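
Put together, a redistribution from decomposition Pa to Pb might look like the sketch below (the construction of Pa, Pb, and A, and the reuse of the hypothetical 1-D types, are assumptions):

    Swap<Range1D> S;
    computeSwap (Pa, Pb, S);      // which ranges each PE must send/receive
    swapBegin (S, A, A, 0, pe);   // in place: A and B may be the same datastructure,
                                  // so A must be allocated for both Pa[pe] and Pb[pe]
    /* independent sequential work can overlap the communication here */
    swapEnd (S, A, A, 0, pe);     // A now holds data for domain Pb[pe]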

Implementations:

  • message passing: MPI, SHMEM
    • begin marshalls and sends messages
    • end receives messages and unmarshalls
    • asynchronous
  • CC-SAS
    • begin waits for availability, then copies between datastructures
    • end waits for everything to be done
    • no buffering
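
As a rough illustration of the message-passing variant, swapBegin can marshall outgoing ranges and post nonblocking sends, while swapEnd receives and unmarshalls incoming ranges and then waits for the sends. The sketch below assumes a Swap layout (sendTo, recvFrom) and Pending bookkeeping invented for illustration, and uses the marshalling helpers (sizeBytes, toBytes, fromBytes) described under Helper functions below:

    #include <mpi.h>
    #include <vector>

    struct SwapEntry { int pe; Range1D range; };
    struct Swap1D {                          // assumed layout, for illustration only
        std::vector<SwapEntry> sendTo, recvFrom;
    };

    struct Pending {                         // keeps send buffers alive until swapEnd
        std::vector<std::vector<char>> bufs;
        std::vector<MPI_Request> reqs;
    };

    void swapBeginMPI (Swap1D& S, Data1D& A, int id, Pending& p) {
        for (auto& s : S.sendTo) {           // marshall and send each outgoing range
            p.bufs.emplace_back(sizeBytes(A, s.range));
            toBytes(p.bufs.back().data(), A, s.range);
            p.reqs.emplace_back();
            MPI_Isend(p.bufs.back().data(), (int)p.bufs.back().size(), MPI_BYTE,
                      s.pe, id, MPI_COMM_WORLD, &p.reqs.back());
        }
    }

    void swapEndMPI (Swap1D& S, Data1D& B, int id, Pending& p) {
        for (auto& r : S.recvFrom) {         // receive and unmarshall each incoming range
            std::vector<char> buf(sizeBytes(B, r.range));
            MPI_Recv(buf.data(), (int)buf.size(), MPI_BYTE,
                     r.pe, id, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            fromBytes(buf.data(), B, r.range);
        }
        MPI_Waitall((int)p.reqs.size(), p.reqs.data(), MPI_STATUSES_IGNORE);
    }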

Computations

Most work is done by functions taking subdomains of global datastructures as arguments:

fname (Arg1, Arg2, ...) Sequential computation taking in, out, and inout arguments

Global datastructure argument:

  • its local domain should be supplied as another argument
  • it should be allocated for the given domain outside the function call
    • exception: domain is computed within the function
  • can be used in read-only, write-only or read-write mode
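
For instance, a sequential kernel over the hypothetical 1-D types could look like this (scale is an invented example, not a library function):

    // Illustrative sequential computation: A is an inout argument and Ra is
    // its local domain, supplied alongside it.
    void scale (Data1D& A, const Range1D& Ra, double factor) {
        for (long i = Ra.lo; i < Ra.hi; ++i)
            A.elems[i - A.alloc.lo] *= factor;   // index relative to allocated domain
    }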

Helper functions

The following helper functions must be implemented for each global datastructure type:

copy (Data A, Range Ra, Data B, Range Rb)   Copy A[Ra] to B[Rb] (R)
sizeBytes (Data A, Range R)                 Number of bytes needed to marshall A[R] (R)
toBytes (char* buf, Data A, Range R)        Write a representation of A[R] to buf (R)
fromBytes (char* buf, Data A, Range R)      Read a representation of A[R] from buf (R)
ckalloc (A, R)                              Allocate datastructure A for domain R (A)
print (f, A, R)                             Print A[R] to file f (O)
  • Functions tagged (R) are needed by redistribution operations / swaps
  • Functions tagged (A) are needed by data allocation macros
  • Functions tagged (O) are optional
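
For the hypothetical 1-D types, the (R)-tagged marshalling helpers reduce to plain memory copies; this is a sketch, not the library's implementation for its built-in array types:

    #include <cstring>

    size_t sizeBytes (const Data1D& A, const Range1D& R) {
        return (size_t)(R.hi - R.lo) * sizeof(double);
    }

    void toBytes (char* buf, const Data1D& A, const Range1D& R) {
        std::memcpy(buf, A.elems + (R.lo - A.alloc.lo), sizeBytes(A, R));
    }

    void fromBytes (char* buf, Data1D& A, const Range1D& R) {
        std::memcpy(A.elems + (R.lo - A.alloc.lo), buf, sizeBytes(A, R));
    }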

If a range type is used to automatically compute a redistribution with computeSwap, the following functions are also needed:

intersect (Range Ra, Range Rb, Range Rc)   Compute the intersection of domains Ra and Rb into Rc
noElements (Range R)                       Get the domain element count
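
For half-open 1-D intervals, both are one-liners (again over the hypothetical Range1D):

    void intersect (const Range1D& Ra, const Range1D& Rb, Range1D& Rc) {
        Rc.lo = Ra.lo > Rb.lo ? Ra.lo : Rb.lo;
        Rc.hi = Ra.hi < Rb.hi ? Ra.hi : Rb.hi;
        if (Rc.hi < Rc.lo) Rc.hi = Rc.lo;        // empty intersection
    }

    long noElements (const Range1D& R) {
        return R.hi - R.lo;
    }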

For N-dimensional arrays and other basic types, the library defines range and data types and all required helper functions.