An On-the-Fly Method to Exchange Vector Clocks in Distributed-Memory Programs

Vector clocks are logical timestamps used in correctness tools to analyze the happened-before relation between events in parallel program executions. In particular, race detectors use them to find concurrent conflicting memory accesses, and replay tools use them to reproduce or find alternative exec...

Full description

Saved in:
Bibliographic Details
Published in2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) pp. 530 - 540
Main Authors Schwitanski, Simon, Tomski, Felix, Protze, Joachim, Terboven, Christian, Muller, Matthias S.
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.05.2022
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Vector clocks are logical timestamps used in correctness tools to analyze the happened-before relation between events in parallel program executions. In particular, race detectors use them to find concurrent conflicting memory accesses, and replay tools use them to reproduce or find alternative execution paths. To record the happened-before relation with vector clocks, tool developers have to consider the different synchronization concepts of a programming model, e.g., barriers, locks, or message exchanges. Especially in distributed-memory programs, various concepts result in explicit and implicit synchronization between processes. Previously implemented vector clock exchanges are often specific to a single programming model, and a translation to other programming models is not trivial. Consequently, analyses relying on the vector clock exchange remain model-specific. This paper proposes an abstraction layer for on-the-fly vector clock exchanges for distributed-memory programs. Based on the programming models MPI, OpenSHMEM, and GASPI, we define common synchronization primitives and explain how model-specific procedures map to our model-agnostic abstraction layer. The exchange model is general enough also to support synchronization concepts of other parallel programming models. We present our implementation of the vector clock abstraction layer based on the Generic Tool Infrastructure with translators for MPI and OpenSHMEM. In an overhead study using the SPEC MPI 2007 benchmarks, the slowdown of the implemented vector clock exchange ranges from 1.1x to 12.6x for runs with up to 768 processes.
ISBN:9781665497480
DOI:10.1109/IPDPSW55747.2022.00093