HEPnOS: a Specialized Data Service for High Energy Physics Analysis
Published in | 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) pp. 637 - 646 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.05.2023 |
Summary: | In this paper, we present HEPnOS, a distributed data service for managing data produced by high-energy physics (HEP) experiments. Using HEPnOS, HEP applications can use HPC resources more efficiently than traditional file-based applications. The file-based model leads to a rigid, chunk-based allocation of computational resources and limits the number of cores that can be used concurrently by an HEP application. The fundamental problem is that organizing domain-specific data into files inadvertently introduces a single, artificial, conflated tuning parameter that puts key optimization goals into conflict: larger file sizes reduce metadata overhead and thus improve I/O efficiency, but smaller file sizes provide more opportunity for workflow parallelism and load balancing. In this work, we introduce a domain-specific data service that removes that constraint so that data can be accessed and processed at its natural granularity while still maintaining I/O efficiency. By removing the constraints introduced by file handling, we are able to obtain better scaling and make efficient use of more cores for processing a fixed-size data sample. We demonstrate the improved scalability by using an application developed in the file-based paradigm and comparing it to a version modified to use HEPnOS. |
---|---|
DOI: | 10.1109/IPDPSW59300.2023.00108 |