Network Interface Support for Shared Virtual Memory on Clusters | Computer Science Department at Princeton University

Report ID:

TR-579-98

Authors:

Bilas, Angelos / Liao, Cheng / Singh, Jaswinder Pal

Date:

February 1998

Abstract:

Clusters of symmetric multiprocessors (SMPs) are important platforms for
high-performance computing. A lot of research has been done in building
network interconnects and communication layers that deliver
low-latency and high-bandwidth communication to the user. With the
success of hardware cache-coherent distributed shared memory (DSM), a
lot of effort has also been made to support the coherent shared
address space programming model in software on clusters. However,
performance is still far from that achieved on hardware DSM systems.

In this work we investigate the use of mechanisms in the software
communication layer and the underlying network interface to
substantially enhance the performance of shared virtual memory (SVM)
on clusters of SMPs. We use a real implementation with a programmable
network interface as our prototype, but our extensions are
general-purpose and can be provided by network interfaces that do not
employ a programmable processor.

We examine how the protocol layer can take advantage of each mechanism
in the communication layer and be restructured accordingly. The final
protocol (SVM-NI) eliminates the need for interrupts and asynchronous
protocol handling. For each mechanism, we evaluate the impact on the
end performance of ten applications with widely varying
characteristics. We demonstrate that substantial improvements in
performance can indeed be achieved, and find that different
applications need different mechanisms among the ones we use.
Application performance improves by up to 50% for applications that end
up with reasonably good speedups; individual components of execution
time targeted by each mechanism are reduced by even higher
percentages. Finally, we use a firmware performance monitor,
integrated with the communication layer, to understand the drawbacks
of the system, to identify interesting tradeoffs in the protocol layer
for future exploration, and to identify the remaining bottlenecks in
SVM performance that should be addressed next.