Automatic Integration Testbeds validation on Open Science Grid

Bibliographic Details
Published in: Journal of Physics: Conference Series, Vol. 331, No. 6, p. 062027 (6 pp.)
Main Authors: Caballero, J.; Thapa, S.; Gardner, R.; Potekhin, M.
Format: Journal Article
Language: English
Published: Bristol: IOP Publishing, 23.12.2011
ISSN: 1742-6596; 1742-6588
DOI: 10.1088/1742-6596/331/6/062027

More Information
Summary: A recurring challenge in deploying high-quality production middleware is the extent to which realistic testing occurs before the software is released into the production environment. We describe here an automated system for validating releases of the Open Science Grid software stack that leverages the (pilot-based) PanDA job management system developed and used by the ATLAS experiment. The system was motivated by a desire to subject the OSG Integration Testbed to more realistic validation tests, in particular tests that resemble as closely as possible the actual job workflows used by the experiments, exercising job scheduling at the compute element (CE), the worker node execution environment, transfer of data to and from the local storage element (SE), and so on. In this context, candidate releases of OSG compute and storage elements can be tested by injecting large numbers of synthetic jobs that vary in complexity and in the coverage of services tested. The native capabilities of the PanDA system can thus be used to define jobs, monitor their execution, and archive the resulting run statistics, including success and failure modes. A repository of generic workflows and job types for measuring various metrics of interest has been created. A command-line toolset has been developed so that testbed managers can quickly submit "VO-like" jobs into the system when newly deployed services are ready for testing. A system for automatic submission has also been built that sends jobs to integration testbed sites, collects the results in a central service, and generates regular reports on performance and reliability.
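The summary describes the submit-and-tally workflow only at a high level. Purely as an illustration, a minimal sketch of such a loop might look like the following; every name in it (run_payload.sh, the site and workflow lists, submit_test_job) is a hypothetical placeholder and not part of the actual OSG or PanDA toolset described in the paper.

```python
# Minimal sketch of an automated submit-and-tally loop of the kind the summary
# describes. All names here (run_payload.sh, the site and workflow lists,
# submit_test_job) are hypothetical placeholders, not the real OSG/PanDA tooling.
import subprocess
from collections import Counter

SITES = ["ITB_SITE_A", "ITB_SITE_B"]        # placeholder integration testbed sites
WORKFLOWS = ["cpu_burn", "stage_in_out"]    # placeholder synthetic workflows

def submit_test_job(site: str, workflow: str) -> bool:
    """Run one synthetic 'VO-like' job against a site and report success."""
    result = subprocess.run(
        ["./run_payload.sh", "--site", site, "--workflow", workflow],
        capture_output=True,
    )
    return result.returncode == 0

def run_validation_round() -> Counter:
    """Inject one job per (site, workflow) pair and tally the outcomes."""
    tally = Counter()
    for site in SITES:
        for workflow in WORKFLOWS:
            outcome = "success" if submit_test_job(site, workflow) else "failure"
            tally[(site, workflow, outcome)] += 1
    return tally

if __name__ == "__main__":
    for key, count in sorted(run_validation_round().items()):
        print(key, count)
```

In the system the paper actually describes, submission goes through PanDA rather than a local script, and results are archived in a central service that generates regular performance and reliability reports.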