Efficient Checkpointing on MIMD Architectures (thesis)

Report ID:
May 1993

This dissertation presents efficient algorithms for checkpointing on
MIMD architectures. These algorithms have been implemented on two
representative machines: a shared-memory multiprocessor and a
message-passing multicomputer. The algorithms and implementations are
evaluated according to three speed metrics: checkpoint time, overhead,
and latency. Checkpointing is important as a general means of
providing software fault tolerance. It is also the backbone of certain program
control utilities, such as job-swapping, process migration, and
playback debugging. We employ several techniques to minimize the
invasiveness of the checkpointer on the target program: main-memory
checkpointing, copy-on-write, buffering, compression, and the
elimination of bottlenecks and extra control messages. The major
result of this dissertation is that we can
implement efficient checkpointing on MIMD architectures, thereby
enhancing the usability of such machines.
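
As a rough illustration of the copy-on-write idea named above, the following is a minimal single-process sketch in Python. It is not the thesis's implementation. It assumes a Unix-like system where fork() gives the child a copy-on-write view of the parent's memory, and the names checkpoint, restore, and demo are hypothetical. The point it shows is that the target program keeps running while a frozen snapshot of its state is written out.

```python
import os
import pickle
import tempfile


def checkpoint(state, path):
    """Fork a child that writes `state` to `path`; return the child's pid.

    Assumption: a Unix-like OS. After fork(), the child sees memory as it
    was at the fork, and pages are copied lazily only when one side writes.
    """
    pid = os.fork()
    if pid == 0:
        # Child: serialize the frozen snapshot, then exit immediately
        # without running the parent's cleanup handlers.
        status = 1
        try:
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump(state, f)
                f.flush()
                os.fsync(f.fileno())
            # Atomic rename, so a reader never sees a half-written file.
            os.replace(tmp, path)
            status = 0
        finally:
            os._exit(status)
    return pid


def restore(path):
    """Load a previously written checkpoint."""
    with open(path, "rb") as f:
        return pickle.load(f)


def demo():
    state = {"step": 0, "data": list(range(5))}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "ckpt.pkl")
        pid = checkpoint(state, path)
        # The parent continues computing while the child writes.
        state["step"] = 1
        state["data"].append(99)
        _, wait_status = os.waitpid(pid, 0)
        saved = restore(path)
    return state, saved, os.WEXITSTATUS(wait_status)


if __name__ == "__main__":
    live, saved, code = demo()
    print("live state: ", live)
    print("checkpoint: ", saved)
    print("child exit: ", code)
```

In this sketch the parent pauses only for the fork itself, and the checkpoint reflects the state at the moment of the fork even though the parent modifies it afterwards. Coordinating such snapshots across many processors, which is the subject of the dissertation, is outside the scope of this example.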
