Relaxed Consistency and Coherence Granularity in DSM Systems: A Performance Evaluation
During the past few years, two main approaches have been taken to
improve the performance of software shared memory implementations:
relaxing consistency models and providing fine-grained access control.
Their performance tradeoffs, however, are not well understood. This
paper compares the performance of three consistency models across
four coherence granularity sizes on a 16-node cluster of
The main result
is that no single combination of protocol, granularity, and mechanism
to handle message arrival performs best for all applications.
The experiments with five classes of applications show
that the combination of the sequential consistency (SC) protocol, a
polling mechanism, and very fine (64-byte) coherence granularity works
quite well for most applications. As the coherence granularity increases,
single-writer lazy release consistency (SW-LRC) and home-based lazy
release consistency (HLRC) perform better than the SC protocol for
applications with false sharing and coarse-grain
synchronization. However, the absolute performance with
a coarse coherence granularity depends upon whether the application causes
significant internal fragmentation.
The HLRC protocol works better than SW-LRC when the coherence
granularity increases to a typical page size (4,096 bytes).
The polling mechanism is crucial for the SC protocol but not
for the SW-LRC and HLRC protocols for certain applications.
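
The relationship between coherence granularity, false sharing, and internal fragmentation described above can be illustrated with a minimal sketch. It is not taken from the evaluated systems: the function names, the example addresses, and the definition of fragmentation as the unused fraction of transferred bytes are all assumptions made for illustration. Only the 64-byte and 4,096-byte granularities come from the text.

```python
# Illustrative sketch only; names and example values are hypothetical.

def block_id(addr, granularity):
    """Index of the coherence block that contains byte address addr."""
    return addr // granularity

def falsely_shared(addr_a, addr_b, granularity):
    """True if two distinct addresses land in the same coherence block."""
    return addr_a != addr_b and block_id(addr_a, granularity) == block_id(addr_b, granularity)

def fragmentation(bytes_needed, granularity):
    """Fraction of transferred bytes that are unused when whole blocks are fetched.

    Assumes the needed data starts at a block boundary (a simplification).
    """
    blocks = -(-bytes_needed // granularity)  # ceiling division
    transferred = blocks * granularity
    return (transferred - bytes_needed) / transferred

if __name__ == "__main__":
    # Two nodes write to unrelated data at hypothetical addresses 0 and 128.
    for granularity in (64, 4096):
        print(granularity,
              falsely_shared(0, 128, granularity),
              fragmentation(100, granularity))
```

Under these assumptions, the two writes fall in different blocks at 64 bytes but in the same block at 4,096 bytes, and fetching 100 useful bytes wastes a much larger share of the transfer at the coarser granularity.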
This technical report has been published as:
- Relaxed Consistency and Coherence Granularity in DSM Systems: A
Performance Evaluation. Y. Zhou, L. Iftode, J.P. Singh, K. Li,
B.R. Toonen, I. Schoinas, M.D. Hill, and D.A. Wood. Proceedings
of the 6th ACM Symp. on Principles and Practice of Parallel
Programming, June 1997.