Advances in Grid Computing for the Fabric for Frontier Experiments Project at Fermilab
Published in | Journal of physics. Conference series Vol. 898; no. 5; pp. 52026 - 52033 |
---|---|
Format | Journal Article |
Language | English |
Published | Bristol, IOP Publishing, 01.10.2017 |
Summary: | The Fabric for Frontier Experiments (FIFE) project is a major initiative within the Fermilab Scientific Computing Division charged with leading the computing model for Fermilab experiments. Work within the FIFE project creates close collaboration between experimenters and computing professionals to serve high-energy physics experiments of differing size, scope, and physics area. The FIFE project has worked to develop common tools for job submission, certificate management, software and reference data distribution through CVMFS repositories, robust data transfer, job monitoring, and databases for project tracking. Since the project's inception, the experiments under the FIFE umbrella have significantly matured and now present an increasingly complex list of requirements to service providers. To meet these requirements, the FIFE project has been involved in transitioning the Fermilab General Purpose Grid cluster to support a partitionable slot model, expanding the resources available to experiments via the Open Science Grid, assisting with commissioning dedicated high-throughput computing resources for individual experiments, supporting the efforts of the HEP Cloud projects to provision a variety of back-end resources, including public clouds and high-performance computers, and developing rapid onboarding procedures for new experiments and collaborations. The larger demands also require enhanced job monitoring tools, which the project has developed using such tools as ElasticSearch and Grafana. [...] in helping experiments manage their large-scale production workflows. This group in turn requires a structured service to facilitate smooth management of experiment requests, which FIFE provides in the form of the Production Operations Management Service (POMS). POMS is designed to track and manage requests from the FIFE experiments to run particular workflows, and to support troubleshooting and triage in case of problems.
Recently, a new certificate management infrastructure called Distributed Computing Access with Federated Identities (DCAFI) has been put in place. It has eliminated our dependence on a Fermilab-specific third-party Certificate Authority service and better accommodates FIFE collaborators without a Fermilab Kerberos account. DCAFI integrates the existing InCommon federated identity infrastructure, CILogon Basic CA, and a MyProxy service using a new general-purpose open-source tool. We will discuss the general FIFE onboarding strategy, progress in expanding FIFE experiments' presence on the Open Science Grid, new tools for job monitoring, the POMS service, and the DCAFI project. |
---|---|
ISSN: | 1742-6588 1742-6596 |
DOI: | 10.1088/1742-6596/898/5/052026 |