On-line detection of large-scale parallel application's structure

Bibliographic Details
Published in	2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp. 1-10
Main Authors	Llort, German; Gonzalez, Juan; Servat, Harald; Gimenez, Judit; Labarta, Jesus
Format	Conference Proceeding
Language	English
Published	IEEE 01.04.2010
Subjects	Availability; Computational intelligence; Degradation; Delay; Filtering; Information analysis; Large-scale systems; Merging; Performance analysis; Runtime
Summary:	With larger and larger systems being constantly deployed, trace-based performance analysis of parallel applications has become a challenging task. Even if the amount of performance data gathered per process is small, traces rapidly become unmanageable when the information collected from all processes is merged. In general, efficient analysis of such a large volume of data depends on a prior filtering step that directs the analyst's attention towards what is meaningful for understanding the observed application behavior. Furthermore, the iterative nature of most scientific applications usually ends up producing repetitive information. Discarding irrelevant data aims at reducing both the size of traces and the time required to perform the analysis and deliver results. In this paper, we present an on-line analysis framework that relies on clustering techniques to intelligently select the most relevant information to understand how the application behaves, while keeping the volume of performance data at a reasonable size.
ISBN:	1424464420 9781424464425
ISSN:	1530-2075
DOI:	10.1109/IPDPS.2010.5470350