A message passing benchmark for unbalanced applications

Bibliographic Details
Published in: Simulation Modelling Practice and Theory, Vol. 16, no. 9, pp. 1177-1189
Main Authors: Dinan, James; Olivier, Stephen; Sabin, Gerald; Prins, Jan; Sadayappan, P.; Tseng, Chau-Wen
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.10.2008
Summary: We present a distributed memory parallel implementation of the unbalanced tree search (UTS) benchmark using MPI and investigate MPI’s ability to efficiently support irregular and nested parallelism through continuous dynamic load balancing. Two load balancing methods are explored: work sharing using a centralized work server and distributed work stealing using explicit polling to service steal requests. Experiments indicate that in addition to a parameter defining the granularity of load balancing, message-passing paradigms require additional techniques to manage the volume of communication and mitigate runtime overhead. Using additional parameters, we observed an improvement of up to 3–4X in parallel performance. We report results for three distributed memory parallel computer systems and use UTS to characterize the performance and scalability on these systems. Overall, we find that the simpler work sharing approach with a single work server achieves good performance on hundreds of processors and that our distributed work stealing implementation scales to thousands of processors and delivers more robust performance that is less sensitive to the particular workload and load balancing parameters.
ISSN: 1569-190X, 1878-1462
DOI: 10.1016/j.simpat.2008.06.004
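
Illustration: the summary mentions work stealing with explicit polling, tuned by a load balancing granularity and further parameters. The following is a minimal single-process sketch of that idea in Python, not the authors' MPI implementation. Everything named here is an assumption made for illustration: the toy tree rule in `children`, the depth cap `MAX_DEPTH`, the round-robin victim choice, and the parameters `chunk_size` (nodes moved per steal) and `poll_interval` (nodes processed between polls). Workers take turns in a loop rather than running in parallel, so the sketch shows only the bookkeeping, not performance.

```python
import hashlib
from collections import deque

MAX_DEPTH = 8  # assumption: hard depth cap so the toy tree is always finite


def children(node):
    """Derive a node's children from a hash of its key (toy stand-in for UTS)."""
    depth, key = node
    if depth >= MAX_DEPTH:
        return []
    digest = hashlib.sha1(key).digest()
    if depth == 0:
        count = 8  # give the root a fixed fan-out so the tree is non-trivial
    else:
        # Skewed branching: most nodes are leaves, some have four children.
        count = 4 if digest[0] < 80 else 0
    return [(depth + 1, digest + bytes([i])) for i in range(count)]


def count_sequential(root):
    """Reference traversal: count every node with one depth-first stack."""
    stack, seen = [root], 0
    while stack:
        node = stack.pop()
        seen += 1
        stack.extend(children(node))
    return seen


def simulate_work_stealing(root, workers=4, chunk_size=2, poll_interval=4):
    """Simulate work stealing; return the number of nodes each worker processed."""
    stacks = [deque() for _ in range(workers)]
    requests = [deque() for _ in range(workers)]  # steal requests queued per victim
    waiting = [False] * workers                   # thief has a request outstanding
    next_victim = [(w + 1) % workers for w in range(workers)]
    processed = [0] * workers
    stacks[0].append(root)

    while any(stacks):
        for w in range(workers):
            # Local work: process at most poll_interval nodes between polls.
            for _ in range(poll_interval):
                if not stacks[w]:
                    break
                node = stacks[w].pop()
                processed[w] += 1
                stacks[w].extend(children(node))

            # Explicit poll: answer every queued steal request, granting a
            # chunk from the old end of the stack only if enough work remains.
            while requests[w]:
                thief = requests[w].popleft()
                waiting[thief] = False
                if len(stacks[w]) >= 2 * chunk_size:
                    for _ in range(chunk_size):
                        stacks[thief].append(stacks[w].popleft())

            # Idle worker with no outstanding request: ask the next victim.
            if not stacks[w] and not waiting[w]:
                victim = next_victim[w]
                next_victim[w] = (victim + 1) % workers
                if victim != w:
                    requests[victim].append(w)
                    waiting[w] = True
    return processed


if __name__ == "__main__":
    root = (0, b"uts-toy-root")
    per_worker = simulate_work_stealing(root)
    print("nodes per worker:", per_worker)
    print("total matches sequential:", sum(per_worker) == count_sequential(root))
```

Raising `chunk_size` moves more nodes per steal and so reduces the number of steals, while raising `poll_interval` makes a busy worker answer requests less often; in this sketch both only change how the same set of nodes is divided among workers.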