MUSA: A Multi-level Simulation Approach for Next-Generation HPC Machines

The complexity of High Performance Computing (HPC) systems is increasing in the number of components and their heterogeneity. Interactions between software and hardware involve many different aspects which are typically not transparent to scientific programmers and system architects. Therefore, pred...

Full description

Saved in:
Bibliographic Details
Published inSC16: International Conference for High Performance Computing, Networking, Storage and Analysis pp. 526 - 537
Main Authors Grass, Thomas, Allande, Cesar, Armejach, Adria, Rico, Alejandro, Ayguade, Eduard, Labarta, Jesus, Valero, Mateo, Casas, Marc, Moreto, Miquel
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.11.2016
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:The complexity of High Performance Computing (HPC) systems is increasing in the number of components and their heterogeneity. Interactions between software and hardware involve many different aspects which are typically not transparent to scientific programmers and system architects. Therefore, predicting the behavior of current scientific applications on future HPC infrastructures is a challenging task. In this paper we present MUSA, an end-to-end methodology that employs a multi-level simulation infrastructure. By combining different levels of abstraction, MUSA is able to model the communication network, microarchitectural details and system software interactions, providing different trade-offs in terms of simulation cost and accuracy. We compare detailed MUSA simulations with native executions of up to 2,048 cores and find relative errors that are within 10% in the common case. In addition, we use MUSA to simulate up to 16,384 cores and successfully identify scalability bottlenecks due to different factors, e.g. memory contention or load imbalance. We also compare different system configurations, showing how MUSA can help system designers to assess the usefulness of future technologies in next-generation HPC machines.
ISSN:2167-4337
DOI:10.1109/SC.2016.44