Compiling data-parallel programs for clusters of SMPs
Published in | Concurrency and computation Vol. 16; no. 2-3; pp. 111 - 132 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Chichester, UK: John Wiley & Sons, Ltd, 01.02.2004 |
Subjects | |
Summary: | Clusters of shared‐memory multiprocessors (SMPs) have become the most promising parallel computing platforms for scientific computing. However, SMP clusters significantly increase the complexity of user application development when using the low‐level application programming interfaces MPI and OpenMP, forcing users to deal with both distributed‐memory and shared‐memory parallelization details. In this paper we present extensions of High Performance Fortran (HPF) for SMP clusters which enable the compiler to adopt a hybrid parallelization strategy, efficiently combining distributed‐memory with shared‐memory parallelism. By means of a small set of new language features, the hierarchical structure of SMP clusters may be specified. This information is utilized by the compiler to derive inter‐node data mappings for controlling distributed‐memory parallelization across the nodes of a cluster and intra‐node data mappings for extracting shared‐memory parallelism within nodes. Additional mechanisms are proposed for specifying inter‐ and intra‐node data mappings explicitly, for controlling specific shared‐memory parallelization issues and for integrating OpenMP routines in HPF applications. The proposed features have been realized within the ADAPTOR and VFC compilers. The parallelization strategy for clusters of SMPs adopted by these compilers is discussed as well as a hybrid‐parallel execution model based on a combination of MPI and OpenMP. Experimental results indicate the effectiveness of the proposed features. Copyright © 2004 John Wiley & Sons, Ltd. |
---|---|
Bibliography: | NEC Europe Ltd; Austrian Science Fund; istex:759B6CF80AC41D46E52C0B11E0D9B083072F24C7; ArticleID:CPE767; ark:/67375/WNG-Z7DB48FZ-D |
ISSN: | 1532-0626 1532-0634 |
DOI: | 10.1002/cpe.767 |
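
The summary describes two levels of data mapping: inter-node mappings that distribute data across the nodes of a cluster, and intra-node mappings that divide each node's share among the processors inside it. The sketch below illustrates that idea only. It is not the HPF extension syntax, and it is not taken from the ADAPTOR or VFC compilers. The function names `block_bounds` and `two_level_mapping`, and the choice of contiguous ceiling-sized blocks at both levels, are assumptions made for illustration.

```python
def block_bounds(lo, hi, parts):
    """Split the half-open index range [lo, hi) into `parts` contiguous blocks.

    Each block holds ceil(n / parts) indices, except that trailing blocks
    may be shorter or empty when the range does not divide evenly.
    """
    n = hi - lo
    size = -(-n // parts)  # ceiling division
    return [
        (min(lo + p * size, hi), min(lo + (p + 1) * size, hi))
        for p in range(parts)
    ]


def two_level_mapping(n, nodes, threads_per_node):
    """Map indices 0..n-1 to (node, thread) pairs in two steps.

    Step 1 (inter-node, assumed): block-distribute the indices over the nodes.
    Step 2 (intra-node, assumed): block-distribute each node's indices over
    the threads of that node.
    """
    mapping = {}
    for node, (node_lo, node_hi) in enumerate(block_bounds(0, n, nodes)):
        per_thread = block_bounds(node_lo, node_hi, threads_per_node)
        for thread, bounds in enumerate(per_thread):
            mapping[(node, thread)] = bounds
    return mapping


if __name__ == "__main__":
    # 16 array elements on a hypothetical cluster of 2 nodes with 4 threads each.
    for owner, bounds in sorted(two_level_mapping(16, 2, 4).items()):
        print(owner, bounds)
```

In a hybrid execution model of the kind the summary mentions, the first level would typically correspond to one MPI process per node and the second level to OpenMP threads sharing that node's memory; the sketch only computes index ownership and performs no communication.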